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1 Abstract 

A neural network based system is presented 
which is able to generate point-to-point move- 
ments of robotic manipulators. The foun- 
dation of this approach is the use of proto- 
typical control torque signals which are de- 
fined by a set of parameters. The parameter 
set is used for scaling and shaping of these 
prototypical torque signals to effect a de- 
sired outcome of the system. This approach 
is based on neurophysiological findings that 
the central nervous system stores general- 
ized cognitive representations of movements 
called synergies , schemas , or motor programs. 
It has been proposed that these motor pro- 


grams may be stored as torque-time functions 
in central pattern generators which can be 
scaled with appropriate time and magnitude 
parameters. The central pattern generators 
use these parameters to generate stereotypi- 
cal torque-time profiles, which are then sent 
to the joint actuators. Hence, only a small 
number of parameters need to be determined 
for each point-to-point movement instead of 
the entire torque-time trajectory. This same 
principle is implemented for controlling the 
joint torques of robotic manipulators where 
a neural network is used to identify the rela- 
tionship between the task requirements and 
the torque parameters. Movements are spec- 
ified by the initial robot position in joint co- 
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ordinates and the desired final end-effector 
position in Cartesian coordinates. This in- 
formation is provided to the neural network 
which calculates six torque parameters for 
a two-link system. The prototypical torque 
profiles (one per joint) are then scaled by 
those parameters. After appropriate training 
of the network, our parametric control design 
allowed the reproduction of a trained set of 
movements with relatively high accuracy, and 
the production of previously untrained move- 
ments with comparable accuracy. We con- 
clude that our approach was successful in dis- 
criminating between trained movements and 
in generalizing to untrained movements. 

2 Introduction 

An important problem in space robotics is 
point-to-point control of the robotic arm 
end-effector in an unstructured environment. 
Many attempts have been made to solve this 
problem: the usual methods are tedious and 
computationally intensive to solve in real- 
time, even with the most advanced compu- 
tational methods ( [4], [11], [13]). This paper 
introduces a different strategy based on mo- 
tor control principles used by humans. 

In many studies on human movements, 
consistent and stereotypical hand and joint 
trajectories have been observed across move- 
ment speeds, extents, directions, and exter- 
nal loads. Such findings support the no- 
tion that movements are controlled by pro- 
totypical motor programs which are stored in 
the central nervous system and scaled to fit 
the requirements of each particular movement 
task before playback [l], [2], [5], [7], [12], 
[15], [16]. In particular, it has been proposed 
that these motor programs may be stored as 
muscle force-time functions and that differ- 


ent movements along the same path, but with 
varying speed or paylod, can be executed by 
playing back those functions with appropriate 
time and magnitude scaling. Therefore, the 
human motor system replaces the explicit cal- 
culation of the entire muscle-force profile by 
the calculation of just a few scaling parame- 
ters which are used to control central pattern 
generators (CPG) where the motor programs 
are stored. 

A problem emerging from the motor pro- 
gram concept is that, since an infinite number 
of possible movements exist, the nervous sys- 
tem must have some way to calculate or to 
look up an infinite number of possible scal- 
ing parameters. Recently, engineering solu- 
tions for similar problems have been intro- 
duced in the form of artificial neural networks 
(ANN’s [14]). Essentially, an ANN consists of 
processing elements, interconnection topolo- 
gies, and a learning algorithm governing the 
modification of connection strengths depend- 
ing on mapping performance. Generally, an 
ANN allows the mapping of input values into 
output values based on previously established 
mapping rules. These rules are determined 
via a repetitive trial- an d-error learning pro- 
cedure rather than by explicit calculations. 
An important characteristic of ANN’s is that 
once a correct mapping has been learned for a 
number of input values, the network can gen- 
eralize and provide correct output values even 
for untrained input values. Thus the above 
problem of representing an infinite number of 
parameters can be overcome by using neural 
networks to find suitable solutions. 

To summarize, control by motor programs 
appears to be potentially useful for manipula- 
tor control because the controller would only 
have to calculate a limited number of scal- 
ing parameters before movement onset rather 
than calculating the entire joint torque-time 
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profiles in real-time. This results in a robotic 
manipulator control system that can be re- 
ferred to as a Parametric Control System, and 
is presented here as a means of controlling 
the joint torques of a two degree-of-freedom 
planar robotic manipulator. Furthermore, 
this approach is used in conjunction with a 
neural network which identifies the relation- 
ship between the task requirements and the 
torque parameters. Therefore, the approach 
presented here combines the motor progiam 
concept with neural networks to determine 
the joint torque-time functions necessary to 
drive a robotic manipulator end-effector from 
an initial to a desired final configuration. 

3 Control Strategy 

The control problem is to move a two-link 
planar robotic arm, as shown in Figure 1, 
from an initial position to a desired final posi- 
tion within the workspace shown. The robot 
manipulator control system, which was used, 
is designed to utilize the benefits of the motor 
program concept, and is illustrated in Fig- 
ure 2. The adaptive controller is an ANN, 
trained to map inputs x d , 6 a into outputs 
p. The parameters p are applied to a func- 
tion generator which generates a prototypical 
time-function. This time-function is scaled 
by p to yield the force-time functions T d (t), 
one per joint, to be applied by the plant. In 
the work reported here, the plant is the Ra- 
dius robotic manipulator at the University of 
Toronto Institute for Aerospace Studies [3], 
This manipulator is a two degree-of-freedom 
planar manipulator with rigid links, where 
the links are supported by airjets in the hor- 
izontal plane. The airjets allow the Radius 
robotic manipulator to move in the horizontal 
plane without friction. The two joint actua- 
tors are harmonic-drive servomotors with the 


joint position 6 a being measured by precision 
potentiometers. 

The ANN was implemented using the 
structure shown in Figure 3. Each neuron is 
a logistic unit having a working range of - 1.0 
to + 1.0 with all of the neurons being fully 
forward-connected. Inputs to the ANN struc- 
ture are x d , the two Cartesian coordinates of 
the desired final gripper position, and 6 a , the 
actual initial angles of both joints, with 6 a 
and x d being sampled once before a move- 
ment. 

The input signals x d and 6 a pass through 
a layer of 127 coarse-coding neurons [6] (each 
neuron being tuned to a range of input val- 
ues with overlapping ranges for neighboring 
neurons), then through two hidden layers of 
20 units per layer (the first layer containing 
20 neurons and the second layer containing 
20 neurons) and finally through a layer of six 
output neurons. The output signals provided 
by the last layer represent the values of the 
six parameters p which were previously de- 
scribed. Three of these parameters are used 
for each joint, p x to p 3 for the shoulder joint 
and pi to p 6 for the elbow joint. 

The parameters p serve as inputs to 
the Function Generator (Figure 2), which 
in turn provides two output signals T d (t), 
one for each joint, which are applied to the 
plant. Both output signals are triggered syn- 
chronously when p changes after a new x d has 
been entered, and each output signal consists 
of two successive sinusoidal half-waves hav- 
ing an overall duration of 4 sec. Figure 4 
illustrates that p x and p., represent the per- 
centage of movement time taken by the first 
lobe of the two torque profiles, one for each 
joint, and p 2 ,P3,P5, and Pe represent the max- 
imum torque amplitudes for each lobe of the 
two torque profiles. 
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The function of the ANN is, essentially, 
to map four discrete input signals x^d a for 
the two joints into six discrete output signals 
p 1 ,p 2 ? and P 3 (the torque parameters for the 
shoulder joint) and p 4 ,p 5 , and p 6 (the torque 
parameters for the elbow joint). Only a single 
mapping action per movement is needed. The 
modifiable ANN weights are adjusted in order 
to achieve satisfactory mapping by a modi- 
fied version of Direct Inverse Modeling [8], a 
known training procedure. 

In this modified Direct Inverse Model- 
ing training procedure, the initial Radius 
joint positions 6 a are first noted and an op- 
erator then moves the gripper into a se- 
lected final position x s along an approxi- 
mately straight path with an approximately 
bell-shaped, single-peaked velocity profile of 
4 second duration. The joint trajectories 6 {t) 
during that movement are recorded on a disk 
and subsequently transformed into predicted 
joint torque profiles T v {t) using Radius’s In- 
verse Dynamics equations. Next, predicted 
joint torque profiles T p {t) are approximated 
by two sinusoidal half-waves of variable rel- 
ative duration and amplitude and the corre- 
sponding parameters p are noted. Then, Ra- 
dius having been reset to 0 ai p is provided as 
inputs to the function generator which sup- 
plies outputs Trf(i) to the actuators in order 
to drive Radius to a final position noted as 
Since the parameterization is only an approx- 
imation, Td(t) and T p (t ) will be somewhat dif- 
ferent and Xd will be somewhat different from 
x s . 

The noted values of xj, 0 a and p character- 
ize one movement of a training set. The above 
steps are repeated for 225 different move- 
ments of various amplitudes and directions 
within the workspace shown in Figure 1 to 
yield a set of 225 training movements char- 
acterized by their respective values of .t^, 9 a 


and p. 

Training of the ANN commences by ini- 
tializing the modifiable weights with random 
values. Then x<i and 6 a of the training set 
are used as the ANN inputs and the corre- 
sponding outputs p are recorded. The differ- 
ence between p as calculated by the ANN and 
the corresponding p as noted for the train- 
ing set is the ANN performance error and 
is used for incremental weight changes ac- 
cording to the backpropagation rule. ANN 
performance is considered satisfactory when 
the output values pi to p 6 , which are applied 
to the function generator, result in a grip- 
per movement to the desired final position x ^ 
such that x a & xj. 

4 Results 

An illustrative representation of network per- 
formance is given by Figure 5, where the fi- 
nal position error of the end-effector is plot- 
ted. The errors are coded as lines from the 
actual final position to the desired final po- 
sition. Performance before training is shown 
in Figure 5A, and after training (10,000 itera- 
tions) in Figure 5B. As can be seen, the errors 
between the desired and actual final end ef- 
fector positions are greatly reduced. In fact, 
the average error drops from 0.75 m before 
training, to 0.03 m after training: in compar- 
ison, the robotic arm is 2.12 m long. There- 
fore, the error after learning was almost an 
order of magnitude smaller than the inter- 
target distances which ranged from 0.1 m to 
0.85 m. Thus, the system was able to dis- 
criminate between targets. Figure 5C shows 
the final position errors of the trained neural- 
network controller using a set of movements 
that were not previously trained. As can be 
seen, the average final position error of 0.07 
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m was slightly higher than the trained data 
set, but was again less than the shortest in- 
tertarget distance. Therefore, generalization 
within the workspace was successful. 

5 Conclusions 

We have described a solution to the control 
of point-to-point movements of a two joint 
planar robotic arm. This parametric control 
concept is qualitatively different from tradi- 
tional approaches described earlier. Instead 
of explicitly calculating the torques for the 
entire trajectory, the new concept specifies 
only a limited number of characteristic pa- 
rameters. In addition, the control system 
presented in this paper provides the follow- 
ing advantages over most other known types 
of systems: 

1. No explicit knowledge of the manipula- 
tor’s dynamics is required. 

2. The nonlinear (ANN) stage is not in a 
control loop which will avoid any prob- 
lems due to computational delays of 
the type generally caused by nonlinear 
stages. 

3. The ANN can be easily retrained for 
different robotic manipulators and/or 
changing robot dynamics. 

4. The design of a controller for a multi- 
link robotic manipulator with n > 2 is 
not qualitatively different than that de- 
scribed in this paper since the nonlin- 
ear stage is designed by trial-and-error 
rather that by an analytical solution. 

In addition, our control concept discrimi- 
nates between targets, and generalizes to un- 
learned target positions. This parametric 


control should be particularly useful for real- 
time robot control in unstructured environ- 
ments since only a limited number of vari- 
ables need to be updated, therefore placing 
less of a computational burden on the con- 
troller. Moreover, our control concept may 
be improved to achieve a more accurate ter- 
minal approach to the targets by the addi- 
tion of sensory feedback, as found in humans. 
Also, this concept could be easily expanded 
to allow for velocity control by direct scaling 
of the torque profiles, and better control of 
movement paths could be achieved by adding 
more parameters (p,) . 
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Figure 1: Two-link planar manipulator and workspace (LI and L2 are the link lengths of the first 
and second links, where LI = L2 = 1.06 m). 
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Figure 2: Block diagram of the control system utilized, where solid lines represent time varying 
actions and hatched lines represent a single mapping actions per movement (Xd and X a are the 
desired and actual end-effector positions, 8 a the initial robot configuration in joint coordinates, 
P' t s are the torque scaling parameters, T\(t) and T 2 (t ) are the joint torque-time profiles for the 
shoulder and elbow joints, and Tj(t) represents the input torques to the plant). 
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Figure 3: Artificial neural network architecture used in the simulations reported here (n = 127). 
All neurons are fully forward- connected to the neurons in the layers in front. 
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Figure 4: Torque parameterization scheme employed. Where p \ , and p 4 are the time of switching 
from the first lobe to the second lobe for torque profiles T\ and T2 respectively, P2 and ps are 
the amplitudes of the first lobe, and p 3 and p 6 are the amplitudes of the second lobe for torque 
profiles Tj and T 2 . 
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Figure 5: Graphical illustration of the final position errors from the actual to the desired final 
end-effector positions (A - final position errors before training, B - final position errors after 
training, for same workspace as A, C - final position errors for an untrained data set, for same 
workspace as A also). 
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